22 research outputs found

    Chemistry-Inspired Adaptive Stream Processing

    Get PDF
    Stream processing engines have emerged as the next generation of data processing systems, addressing the need for low-delay processing. While these systems have been widely studied recently, their ability to adapt their processing logic at run time, upon the detection of events calling for adaptation, is still an open issue. Chemistry-inspired models of computation have been shown to ease the specification of adaptive systems. In this paper, we argue that a higher-order chemical model can be used to specify such an adaptive SPE in a natural way. We also show how such programming abstractions can be enacted in a decentralised environment.
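    To give a flavour of the higher-order rewriting this line of work builds on, the minimal Python sketch below models a chemical "solution" as a multiset of molecules with reaction rules; the molecule names, the overload event and the degraded-mode operator are hypothetical illustrations, not the paper's actual HOCL specification.

```python
# A minimal sketch of multiset rewriting in the spirit of a higher-order
# chemical model; molecule names and the rule are hypothetical illustrations.
from dataclasses import dataclass, field

@dataclass
class Solution:
    """A chemical 'solution': a multiset of molecules plus reaction rules."""
    molecules: list = field(default_factory=list)
    rules: list = field(default_factory=list)   # each rule: (matcher, producer)

    def react(self):
        """Naively apply rules until no rule can fire (fixed point)."""
        changed = True
        while changed:
            changed = False
            for matcher, producer in self.rules:
                picked = matcher(self.molecules)
                if picked is not None:
                    for molecule in picked:
                        self.molecules.remove(molecule)
                    self.molecules.extend(producer(picked))
                    changed = True

# Higher-order adaptation rule: when an 'overload' event molecule appears,
# consume the current operator molecule and inject a degraded-mode operator,
# i.e. change the processing logic at run time.
def overload_matcher(molecules):
    if "event:overload" in molecules and "op:full-aggregation" in molecules:
        return ["event:overload", "op:full-aggregation"]
    return None

def overload_producer(_picked):
    return ["op:sampled-aggregation"]

solution = Solution(
    molecules=["op:full-aggregation", "data:42", "event:overload"],
    rules=[(overload_matcher, overload_producer)],
)
solution.react()
print(solution.molecules)   # ['data:42', 'op:sampled-aggregation']
```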

    GinFlow: A Decentralised Adaptive Workflow Execution Manager

    Get PDF
    Workflow-based computing has become a dominant paradigm for designing and executing scientific applications. After the initial breakthrough of now-standard workflow management systems, several approaches have recently been proposed to decentralise the coordination of the execution. In particular, shared-space-based coordination has been shown to provide appropriate building blocks for such a decentralised execution. Uncertainty also remains a major concern in scientific workflows: the ability to adapt a workflow, change its shape and switch to alternate scenarios on the fly is still missing in workflow management systems. In this paper, based on the shared-space model, we first devise a programmatic way to specify such adaptive workflows. We use a reactive, rule-based programming model to modify the workflow description by changing its associated directed acyclic graph on the fly, without needing to stop and restart the execution from the beginning. Second, we present the GinFlow middleware, a resilient decentralised workflow execution manager implementing these concepts. Through a set of deployments of adaptive workflows with different characteristics, we discuss GinFlow's performance and resilience and show the limited overhead of the adaptiveness mechanism, making it a promising decentralised adaptive workflow execution manager.
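    The following minimal Python sketch illustrates the general idea of rewriting a workflow's directed acyclic graph on the fly, replacing a failed task by an alternate one without restarting the execution; the task names and the substitution helper are illustrative assumptions, not GinFlow's actual rule language or API.

```python
# A minimal sketch of on-the-fly DAG rewriting for an adaptive workflow.
# Task names and the substitution rule are illustrative, not GinFlow's API.

workflow = {                       # task -> set of prerequisite tasks
    "fetch":   set(),
    "analyse": {"fetch"},
    "publish": {"analyse"},
}

def substitute(dag, failed, replacement):
    """Replace a failed task by an alternate one, rewiring its dependants."""
    deps = dag.pop(failed)                 # keep the failed task's prerequisites
    dag[replacement] = deps
    for task, prereqs in dag.items():
        if failed in prereqs:              # dependants now wait on the substitute
            prereqs.discard(failed)
            prereqs.add(replacement)
    return dag

# Suppose 'analyse' fails at run time: switch to a degraded alternative
# without stopping the whole execution and restarting from 'fetch'.
substitute(workflow, "analyse", "analyse_fallback")
print(workflow)
# {'fetch': set(), 'publish': {'analyse_fallback'}, 'analyse_fallback': {'fetch'}}
```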

    Multiple Sclerosis Brain MRI Segmentation Workflow deployment on the EGEE Grid

    Get PDF
    http://www.i3s.unice.fr/~johan/MICCAI-Grid08/pdf/MICCAIGproceedings2008.pdf
    Automatic brain MRI segmentation methods are useful but computationally intensive tools in medical image computing. Deploying them on grid infrastructures can provide an efficient resource for data handling and computing power. In this study, an efficient implementation of a brain MRI segmentation method through a grid-interfaced workflow enactor is proposed. The deployment of the workflow enables simultaneous processing and validation. The importance of parallelism is shown through the concurrent analysis of several MRI subjects. The results obtained from the grid have been compared to the results computed locally on a single computer. Thanks to the power of the grid, the influence of the method's parameters on the resulting segmentations has also been assessed, yielding the best compromise between algorithm speed and result accuracy. This deployment also highlights a bottleneck effect, which remains an issue on the grid.
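    To give a flavour of the kind of concurrency such a deployment exploits, here is a minimal Python sketch that fans a parameter sweep over several subjects out to a process pool; the segment() function, subject identifiers and parameter values are placeholders, not the actual segmentation tool or the grid submission interface.

```python
# A minimal sketch of running a parameter sweep over several subjects in
# parallel; segment() is a stand-in for the real MRI segmentation tool and
# the grid submission machinery, not their actual interfaces.
from concurrent.futures import ProcessPoolExecutor
from itertools import product

def segment(job):
    """Placeholder for one job: segment one subject with one smoothing value."""
    subject, smoothing = job
    return subject, smoothing, f"mask_{subject}_s{smoothing}"

if __name__ == "__main__":
    jobs = list(product(["subj01", "subj02", "subj03"], [0.5, 1.0, 2.0]))
    with ProcessPoolExecutor() as pool:      # one worker per available core
        for subject, smoothing, mask in pool.map(segment, jobs):
            print(subject, smoothing, mask)
```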

    NeuroLOG: Neuroscience Application Workflows Execution on the EGEE Grid

    Get PDF
    Neuroscientists manipulate large image data sets and complex collections of medical data. They are consumers of computing infrastructures for executing complex image analysis pipelines. In the context of the NeuroLOG project, tools and methods are deployed on top of gLite to ease the exploitation of the EGEE grid in the neurosciences. The developments target medical data representation, data set federation, secure storage, image analysis software deployment and grid workflow enactment. NeuroLOG specifically targets three neurological pathologies: brain strokes, brain tumors and multiple sclerosis. The project provides a brain medical image ontology and a metadata schema designed to represent information concerning the target pathologies. It proposes a middleware architecture that integrates existing tools (the MOTEUR workflow enactor interfaced to the EGEE grid, the CORESE semantic query engine, the Data Federator database federation tool, etc.) and develops the missing components needed to address the requirements of neuroscientists. It targets the integration of data resources coming from 5 medical centers. It aims at providing, by the end of the project (2010), a demonstrator that federates hundreds of data sets (hundreds of GB of data) and that can exploit the grid infrastructure to process 4 different neurological data analysis pipelines. The workflow processing engine will be interfaced to the data representation and query system to identify the data sets to be processed. The EGEE grid infrastructure and middleware provide a foundation layer on top of which to build neuroscience applications. However, the gap between the low-level functionality provided (batch processing, distributed file management...) and the application requirements (data flow processing, medical data representation...) is large. The NeuroLOG project aims at fostering the adoption of HealthGrids in a distributed pre-clinical community. It adopts a user-centric perspective to meet neuroscientists' expectations. The tools and methodology enable the integration of heterogeneous site data schemas and the definition of site-centric policies. The NeuroLOG middleware bridges the EGEE grid and local resources to provide a transitional model towards HealthGrids that respects users' desire to control resources and data. The NeuroLOG middleware is currently used to enact a Multiple Sclerosis patient population analysis application on the EGEE grid.

    Multi-infrastructure workflow execution for medical simulation in the Virtual Imaging Platform

    Get PDF
    This paper presents the architecture of the Virtual Imaging Platform supporting the execution of medical image simulation workflows on multiple computing infrastructures. The system relies on the MOTEUR engine for workflow execution and on the DIRAC pilot-job system for workload management. The jGASW code wrapper is extended to describe applications running on multiple infrastructures, and a DIRAC cluster agent that can securely involve personal cluster resources with no administrator intervention is proposed. Grid data management is complemented with local storage used as a failover in case of file transfer errors. Between November 2010 and April 2011 the platform was used by 10 users to run 484 workflow instances representing 10.8 CPU years. Tests show that a small personal cluster can significantly contribute to a simulation running on EGI and that the improved data manager can decrease the job failure rate from 7.7% to 1.5%.
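    The failover idea can be sketched in a few lines: attempt the grid transfer first and fall back to local storage when it fails. The functions below are illustrative placeholders and do not correspond to the platform's real data-management API.

```python
# A minimal sketch of the failover idea: attempt a grid transfer first and
# fall back to local storage on error; these functions are illustrative
# placeholders, not the platform's actual data-management API.
import shutil
from pathlib import Path

class TransferError(Exception):
    pass

def upload_to_grid(path: Path) -> str:
    """Stand-in for a grid file transfer that may fail at run time."""
    raise TransferError("grid storage element unreachable")

def upload_to_local(path: Path, store: Path = Path("/tmp/failover-store")) -> str:
    """Copy the file to a local failover store and return its location."""
    store.mkdir(parents=True, exist_ok=True)
    target = store / path.name
    shutil.copy(path, target)
    return str(target)

def upload(path: Path) -> str:
    try:
        return upload_to_grid(path)
    except TransferError:
        # Local storage as a failover so the job does not fail outright.
        return upload_to_local(path)

if __name__ == "__main__":
    sample = Path("result.dat")
    sample.write_text("simulation output")
    print("stored at:", upload(sample))
```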

    UCB's new Latin American molecular biology server: bo.expasy.org

    Get PDF
    ExPASy is a molecular biology server that provides access to proteomics information through a set of dedicated analysis tools and databases. This server, developed by the Swiss Institute of Bioinformatics (SIB), is a pioneer of its kind and has become one of the most consulted references for research centers and biotechnology companies worldwide. Mirror sites have been set up around the world, in academic and research institutions, to offer efficient access to geographically distributed molecular biology research and development centers. The Universidad Católica Boliviana, through its Instituto de Investigación en Informática Aplicada (IIIA) and in collaboration with the SIB, has implemented a new mirror server, http://bo.expasy.org, for the Latin American region, which was made available to the scientific community at the beginning of November 2002. This article presents the content of the mirror site and its possible applications for research and industry.

    Lifecycle management of services deployed on a distributed computing infrastructure in neuroinformatics

    Get PDF
    There is increasing interest among scientific communities in sharing data and applications in order to support research and foster collaborations. Interdisciplinary domains like the neurosciences are particularly eager for solutions providing the computing power needed for large-scale experimentation. Despite all the progress made in this regard, several challenges related to the interoperability and scalability of Distributed Computing Infrastructures are not completely resolved: these infrastructures face the permanent evolution of technologies, the complexity associated with the adoption of production environments, and low reliability at runtime. This work proposes the modeling and implementation of a service-oriented framework for the execution of scientific applications on Distributed Computing Infrastructures, taking advantage of High Throughput Computing facilities. The model includes a specification for the description of command-line applications; a bridge merging service-oriented architectures with global computing; and the efficient use of local resources and scaling. A reference implementation is proposed to demonstrate the feasibility of the approach. It shows its relevance in the context of two application-driven research projects executing large experiment campaigns on distributed resources. The framework is an alternative to existing solutions, which are often limited to execution concerns only, as it enables the management of legacy codes as services and takes their complete lifecycle into account. Furthermore, the service-oriented approach helps design scientific workflows, which are used as a flexible way of describing applications composed of multiple services. The proposed approach is evaluated both qualitatively and quantitatively using concrete applications in the area of neuroimaging analysis. The qualitative experiments are based on optimizing the specificity and sensitivity of the brain segmentation tools used in the analysis of Magnetic Resonance Images of patients affected by Multiple Sclerosis. The quantitative experiments deal with the speedup and latency measured during the execution of longitudinal brain atrophy detection in patients impaired by Alzheimer's disease.
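    As a rough illustration of what a command-line application description might look like, the sketch below pairs a small descriptor with a function that builds the corresponding invocation; the descriptor fields and the example tool are assumptions for illustration, not the specification defined in this work.

```python
# A rough sketch of describing a command-line application and generating its
# invocation from that description; the descriptor fields and the example
# tool are assumptions for illustration, not the specification of this work.
import shlex

descriptor = {
    "name": "brain_segmentation",
    "executable": "segment_brain",
    "inputs": [
        {"name": "image", "flag": "-i"},
        {"name": "smoothing", "flag": "-s"},
    ],
    "outputs": [
        {"name": "mask", "flag": "-o"},
    ],
}

def build_command(desc, values):
    """Turn a descriptor plus concrete parameter values into a command line."""
    parts = [desc["executable"]]
    for port in desc["inputs"] + desc["outputs"]:
        parts += [port["flag"], str(values[port["name"]])]
    return " ".join(shlex.quote(p) for p in parts)

print(build_command(descriptor,
                    {"image": "t1.nii", "smoothing": 1.0, "mask": "t1_mask.nii"}))
# segment_brain -i t1.nii -s 1.0 -o t1_mask.nii
```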

    Scalability and Locality Awareness of Remote Procedure Calls: An Experimental Study in Edge Infrastructures

    Get PDF
    Cloud computing depends on communication mechanisms that imply location transparency. This transparency comes at the cost of ensuring scalability and acceptable request response times, both of which are tied to locality. Current implementations, as in the case of OpenStack, mostly follow a centralized paradigm, but they lack the service agility that can be obtained with decentralized approaches. In an edge scenario, the communicating entities of an application can be dispersed. In this context, we focus our study on the inter-process communication of OpenStack when its agents are geo-distributed. More precisely, we are interested in the different Remote Procedure Call (RPC) implementations of OpenStack and their behaviour with regard to three classical communication patterns: anycast, unicast and multicast. We discuss how the communication middleware can align with the geo-distribution of the RPC agents with respect to two key factors: scalability and locality. We reached up to ten thousand communicating agents, and the results show that a router-based deployment offers a better trade-off between locality and load balancing. The broker-based deployment suffers from its centralized model, which impacts the achieved locality and scalability.
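    The three patterns under study can be sketched over a toy in-memory bus, as below; the Bus class, topic name and agents are illustrative assumptions, not OpenStack's oslo.messaging API.

```python
# A minimal sketch of the three communication patterns studied (unicast,
# anycast, multicast) over an in-memory bus; agent and topic names are
# illustrative, not OpenStack's oslo.messaging API.
import itertools
from collections import defaultdict

class Bus:
    def __init__(self):
        self.agents = defaultdict(list)            # topic -> list of agent callables
        self._rr = {}                              # topic -> round-robin iterator

    def register(self, topic, agent):
        self.agents[topic].append(agent)
        self._rr[topic] = itertools.cycle(self.agents[topic])

    def unicast(self, topic, index, msg):          # one designated agent
        return self.agents[topic][index](msg)

    def anycast(self, topic, msg):                 # any one agent (round-robin here)
        return next(self._rr[topic])(msg)

    def multicast(self, topic, msg):               # every agent on the topic
        return [agent(msg) for agent in self.agents[topic]]

bus = Bus()
for i in range(3):
    bus.register("compute", lambda msg, i=i: f"agent-{i} handled {msg}")

print(bus.unicast("compute", 0, "boot-vm"))
print(bus.anycast("compute", "boot-vm"))
print(bus.multicast("compute", "refresh-cache"))
```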

    Services lifecycle management using distributed computing infrastructures in neuroinformatics

    No full text
    There is increasing interest among scientific communities in sharing data and applications in order to support research and foster collaborations. Interdisciplinary domains like the neurosciences are particularly eager for solutions providing the computing power needed for large-scale experimentation. Despite all the progress made in this regard, several challenges related to the interoperability and scalability of Distributed Computing Infrastructures are not completely resolved: these infrastructures face the permanent evolution of technologies, the complexity associated with the adoption of production environments, and low reliability at runtime. This work proposes the modeling and implementation of a service-oriented framework for the execution of scientific applications on Distributed Computing Infrastructures, taking advantage of High Throughput Computing facilities. The model includes a specification for the description of command-line applications, a bridge merging service-oriented architectures with global computing, and the efficient use of local resources and scaling. A reference implementation is proposed to demonstrate the feasibility of the approach. It shows its relevance in the context of two application-driven research projects executing large experiment campaigns on distributed resources. The framework is an alternative to existing solutions, which are often limited to execution concerns only, as it enables the management of legacy codes as services and takes their complete lifecycle into account. Furthermore, the service-oriented approach helps design scientific workflows, which are used as a flexible way of describing applications composed of multiple services. The proposed approach is evaluated both qualitatively and quantitatively using concrete applications in the area of neuroimaging analysis. The qualitative experiments are based on optimizing the specificity and sensitivity of the brain segmentation tools used in the analysis of Magnetic Resonance Images of patients affected by Multiple Sclerosis. The quantitative experiments deal with the speedup and latency measured during the execution of longitudinal brain atrophy detection in patients impaired by Alzheimer's disease.

    A Chemistry-inspired Programming Model for Adaptive Decentralised Workflows

    Get PDF
    In this paper, we devise a chemistry-inspired programming model for the decentralised execution of scientific workflows, with the possibility of dynamically adapting the workflow's shape when its initial specification fails to meet the user's requirements, or simply fails to run due to external conditions. We describe a decentralised architecture to support the model and cover its implementation in the GinFlow software prototype.